knitr::opts_chunk$set(echo = TRUE)

Here I list some issues common in data retrieved from the Water Quality Portal (WQP), along with the function that addresses each one where a fix exists.

  1. Data structure
    • Different number of columns. FIX: wqp_checkClasses
    • Inconsistent column classes. FIX: wqp_checkClasses
  2. Data contents
    • Missing units. FIX: wqp_checkUnits
    • Inconsistently reported units, e.g. "mg/l", "mg/L", "ppm", "mg/l as N". FIX: wqp_checkUnits
    • Missing fraction. FIX: wqp_checkFraction
    • Missing detection limit. FIX: wqp_checkBDL
    • Missing or inconsistent collection method
    • Missing or inconsistent analytical method
    • Inconsistent reporting of below-detection-limit (BDL) values. FIX: wqp_checkBDL

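
To make the unit problem in the list above concrete, here is a minimal sketch of lookup-based harmonization in base R. It is not how wqp_checkUnits works internally; the helper `harmonize_units` and the lookup table are hypothetical, and treating "ppm" as "mg/L" is an assumption that only holds for dilute aqueous samples.

```r
# Hypothetical lookup from reported unit strings to a canonical form.
# Assumption: "ppm" is equivalent to "mg/L" (dilute aqueous samples only).
unit_lookup <- c("mg/l"      = "mg/L",
                 "mg/L"      = "mg/L",
                 "ppm"       = "mg/L",
                 "mg/l as N" = "mg/L as N")

# Hypothetical helper: map each reported unit to its canonical form.
# Missing or unrecognized units come back as NA so they can be reviewed.
harmonize_units <- function(units) {
  unname(unit_lookup[trimws(as.character(units))])
}

reported <- c("mg/l", "mg/L", "ppm", "mg/l as N", NA, "ug/l")
harmonized <- harmonize_units(reported)

# Tabulate the result, keeping NA visible as its own category.
table(harmonized, useNA = "ifany")
```

Returning NA for anything outside the lookup keeps unknown units from being silently converted.
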
with(aluminumData, 
     summary(as.factor(ResultDetectionConditionText[is.na(ResultMeasureValue)])))

with(aluminumData, 
     summary(as.factor(ResultDetectionConditionText[!is.na(ResultMeasureValue)])))

So all of the NA values in ResultMeasureValue belong to results reported as below detection limit (BDL).
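
The same check can be written as a single cross-tabulation. The sketch below uses a made-up toy data frame in place of aluminumData, so the values are illustrative only; the column names are the ones used above.

```r
# Toy data standing in for aluminumData (values are made up).
toy <- data.frame(
  ResultMeasureValue = c(0.12, NA, 0.30, NA, 0.05),
  ResultDetectionConditionText = c(NA, "Not Detected", NA,
                                   "Present Below Quantification Limit", NA),
  stringsAsFactors = FALSE
)

# Cross-tabulate "value is missing" against "detection condition is given".
value_missing   <- is.na(toy$ResultMeasureValue)
condition_given <- !is.na(toy$ResultDetectionConditionText)
table(value_missing, condition_given)

# TRUE when every missing value carries a detection condition and vice versa.
all(value_missing == condition_given)
```

If the off-diagonal cells of the table are zero, missing values and BDL flags line up one to one.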

foo <- Reduce(`*`, lapply(wqpData[badrows,], is.na))
summary(foo)
setdiff(names(boundWQ), names(aluminumData))

sample(na.omit(boundWQ$valueType), 10)
unique(boundWQ$resultweightbasis)
sample(unique(boundWQ$resultparticlesizebasis), 10)

foo <- wqpData[stillbad2, ]

glimpse(foo)
sum(!is.na(foo$ResultMeasureValue))
unique(foo$ResultDetectionConditionText)

Checking BDLs

summary(as.factor(wqpData$ResultDetectionConditionText))
summary(as.factor(wqpData$DetectionQuantitationLimitTypeName))
summary(wqpData$DetectionQuantitationLimitMeasure.MeasureValue)

# Can I get detection limit from analytical method?

wqpData %>%
  filter(!is.na(DetectionQuantitationLimitMeasure.MeasureValue),
         !is.na(ResultAnalyticalMethod.MethodIdentifier)) %>% 
  # group_by(ResultAnalyticalMethod.MethodIdentifier, 
  #          ResultAnalyticalMethod.MethodIdentifierContext) %>% 
  transmute(analyticMethod = ResultAnalyticalMethod.MethodIdentifier, 
         analyticContext = ResultAnalyticalMethod.MethodIdentifierContext,
         detlim = DetectionQuantitationLimitMeasure.MeasureValue,
         detunits = DetectionQuantitationLimitMeasure.MeasureUnitCode) %>% 
  # ungroup() %>% 
  unique() %>% 
  arrange(analyticMethod, analyticContext) %>% 
  print(n = 40)

Bottom line: the detection limit cannot be inferred from the analytical method alone.
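
One way to quantify this is to count the distinct detection limits reported under each method. The sketch below uses base R on made-up toy data; the method identifiers "M1" and "M2" and the limits are hypothetical, and `analyticMethod` and `detlim` mirror the column names created in the pipeline above.

```r
# Toy data; method identifiers and limits are made up for illustration.
toy <- data.frame(
  analyticMethod = c("M1", "M1", "M1", "M2", "M2"),
  detlim         = c(0.01, 0.05, 0.01, 0.10, 0.10),
  stringsAsFactors = FALSE
)

# Number of distinct detection limits reported under each method.
n_limits <- tapply(toy$detlim, toy$analyticMethod,
                   function(x) length(unique(x)))
n_limits

# Methods with more than one limit cannot be used to infer the limit.
ambiguous <- names(n_limits)[n_limits > 1]
ambiguous
```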

sample_n(wqpData[badrows, ], 10) %>% glimpse
sample_n(wqpData[goodrows, ], 10) %>% glimpse

wqpData %>% 
  filter(badrows) %>% 
  filter(ResultCommentText != "") %>% 
  glimpse

Does aluminumData have rows that are missing detection-limit information but that come from a dataset that does have such information?

aluminumData %>% 
  group_by(CharacteristicName,
           ResultSampleFractionText,
           MonitoringLocationIdentifier)
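
One possible way to answer the question is to flag rows that lack a detection limit but share a group with at least one row that reports one. This is a sketch on made-up toy data, not the original analysis: it groups by location only, and the column `detlim` is a hypothetical stand-in for DetectionQuantitationLimitMeasure.MeasureValue.

```r
# Toy data standing in for aluminumData (values are made up).
toy <- data.frame(
  MonitoringLocationIdentifier = c("A", "A", "A", "B", "B"),
  detlim = c(0.01, NA, 0.01, NA, NA),
  stringsAsFactors = FALSE
)

# For each row: does any row at the same location report a detection limit?
group_has_detlim <- ave(!is.na(toy$detlim),
                        toy$MonitoringLocationIdentifier,
                        FUN = any)

# Rows that lack a limit themselves but belong to a group that has one.
fillable <- is.na(toy$detlim) & group_has_detlim
toy[fillable, ]
```

`ave()` accepts several grouping vectors, so the characteristic name and sample fraction could be added alongside the location.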


markwh/rcmodel documentation built on May 21, 2019, 12:26 p.m.